Automatic phoneme alignment based on acoustic-phonetic modeling

Author

  • John-Paul Hosom
Abstract

This paper presents a method for speaker-independent automatic phonetic alignment that is distinguished from standard HMM-based “forced alignment” in three respects: (1) specific acoustic-phonetic features are used, in addition to PLP features, by the phonetic classifier; (2) the units of classification consist of distinctive phonetic features instead of phonemes; and (3) observation probabilities depend not only on the current state, but also on the state transition information. The proposed method is compared with a state-of-the-art baseline forced-alignment system on a number of corpora, including telephone speech, microphone speech, and children’s speech. The new method reaches 92.57% agreement within 20 msec on the TIMIT corpus, which is a 26% reduction in error over the baseline method (with 89.95% agreement on TIMIT). The average reduction in error over all corpora is 28%.
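The reported figures can be checked with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: the function names, the inclusive 20 msec threshold, and the toy boundary times are all assumptions. It shows one plausible way to compute agreement within a tolerance and the relative reduction in error between two agreement scores.

```python
def agreement_within(reference_ms, hypothesis_ms, tolerance_ms=20.0):
    """Percentage of hypothesized boundaries within tolerance_ms of the
    corresponding reference boundaries (assumes a one-to-one pairing and
    an inclusive threshold; both are assumptions, not the paper's spec)."""
    if not reference_ms or len(reference_ms) != len(hypothesis_ms):
        raise ValueError("boundary lists must be non-empty and equal in length")
    hits = sum(
        1 for ref, hyp in zip(reference_ms, hypothesis_ms)
        if abs(ref - hyp) <= tolerance_ms
    )
    return 100.0 * hits / len(reference_ms)


def relative_error_reduction(baseline_agreement, new_agreement):
    """Relative reduction in error (percent), where error = 100 - agreement."""
    baseline_error = 100.0 - baseline_agreement
    new_error = 100.0 - new_agreement
    return 100.0 * (baseline_error - new_error) / baseline_error


# Toy boundaries in milliseconds (hypothetical values, not from any corpus).
reference = [100, 250, 400, 610]
hypothesis = [105, 275, 398, 600]
print(agreement_within(reference, hypothesis))  # 3 of 4 boundaries agree

# TIMIT figures quoted in the abstract: errors of 10.05% and 7.43%.
print(round(relative_error_reduction(89.95, 92.57), 1))
```

With the TIMIT numbers from the abstract, the error drops from 10.05% to 7.43%, a relative reduction of about 26%, which matches the stated figure.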


Similar resources

Allophone-based acoustic modeling for Persian phoneme recognition

Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the main obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...



Deep Learning Techniques in Tandem with Signal Processing Cues for Phonetic Segmentation for Text to Speech Synthesis in Indian Languages

Automatic detection of phoneme boundaries is an important sub-task in building speech processing applications, especially text-to-speech synthesis (TTS) systems. The main drawback of forced alignment based on the Gaussian mixture model hidden Markov model (GMM-HMM) is that the phoneme boundaries are not explicitly modeled. In an earlier work, we proposed the use of signal processing cues in tan...


On the impact of phoneme alignment in DNN-based speech synthesis

Recently, deep neural networks (DNNs) have significantly improved the performance of acoustic modeling in statistical parametric speech synthesis (SPSS). However, in current implementations, when training a DNN-based speech synthesis system, phonetic transcripts are required to be aligned with the corresponding speech frames to obtain the phonetic segmentation, called phoneme alignment. Such an...


Design of a phoneme recognition algorithm using the acoustic correlates of phonological features

In the present paper, the phonological feature geometry of the Persian phonemes is analyzed in terms of articulator-free and articulator-bound features, based on the articulator model of nonlinear phonology. Then, the reference phonetic pattern of each feature that consists of one or a set of acoustic correlates, characterized by the quantitative or qualitative values in its phonological re...




Publication date: 2002